Search - Information Retrieval indexing

Search list

[File Operate] MSA_multiindexfilesystem

Description: A multi-indexed file system written for an "Information Storage and Retrieval" file systems course. It is modeled on IBM VSAM, uses variable-length records, and indexes them with a B+ tree. Data is stored in a pile file (*.pile) and the indexes in separate index files (*.idx).
Platform: | Size: 20543 | Author: Toby | Hits:
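
The storage layout described for MSA_multiindexfilesystem can be sketched roughly as follows. This is a minimal illustration and not the project's code: the class name `PileFile`, the 4-byte length prefix, and the method names are assumptions, and a sorted in-memory list (via `bisect`) stands in for the B+ tree that the project keeps in its *.idx files.

```python
import bisect
import io
import struct


class PileFile:
    """Append-only store of variable-length records plus one sorted key index."""

    def __init__(self, stream):
        self.stream = stream   # binary file-like object standing in for a *.pile file
        self.keys = []         # sorted keys (stand-in for a B+ tree kept in a *.idx file)
        self.offsets = []      # offsets[i] is the byte offset of the record for keys[i]

    def append(self, key, payload):
        """Write one record at the end of the pile and register it in the index."""
        self.stream.seek(0, io.SEEK_END)
        offset = self.stream.tell()
        # Assumed record layout: 4-byte big-endian length, then the payload bytes.
        self.stream.write(struct.pack(">I", len(payload)))
        self.stream.write(payload)
        pos = bisect.bisect_left(self.keys, key)
        self.keys.insert(pos, key)
        self.offsets.insert(pos, offset)
        return offset

    def _read_at(self, offset):
        """Read the length-prefixed record that starts at the given offset."""
        self.stream.seek(offset)
        (length,) = struct.unpack(">I", self.stream.read(4))
        return self.stream.read(length)

    def get(self, key):
        """Return the payload stored under key, or None if the key is absent."""
        pos = bisect.bisect_left(self.keys, key)
        if pos == len(self.keys) or self.keys[pos] != key:
            return None
        return self._read_at(self.offsets[pos])

    def scan(self, low, high):
        """Return (key, payload) pairs with low <= key <= high, in key order."""
        start = bisect.bisect_left(self.keys, low)
        end = bisect.bisect_right(self.keys, high)
        return [(self.keys[i], self._read_at(self.offsets[i])) for i in range(start, end)]


pile = PileFile(io.BytesIO())
pile.append("smith", b"a fairly long record body")
pile.append("adams", b"short")
print(pile.get("adams"))
print([key for key, _ in pile.scan("a", "t")])
```

A multi-indexed variant would keep one such key list per indexed field, all pointing at offsets in the same pile. The ordered range scan is the part a B+ tree serves by walking its linked leaf nodes.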

[Search Engine] 信息检索报告 (Information Retrieval Report)

Description: Information Retrieval (IR) is the discipline that deals with retrieval of unstructured data, especially textual documents, in response to a query or topic statement, which may itself be unstructured, e.g., a sentence or even another document, or which may be structured, e.g., a boolean expression. The need for effective methods of automated IR has grown in importance because of the tremendous explosion in the amount of unstructured data, both internal, corporate document collections, and the immense and growing number of document sources on the Internet. This report is a tutorial and survey of the state of the art, both research and commercial, in this dynamic field. The topics covered include: formulation of structured and unstructured queries and topic statements, indexing (including term weighting) of document collections, methods for computing the similarity of queries and documents, classification and routing of documents in an incoming stream to users on the basis of topic or need statements, clustering of document collections on the basis of language or topic, and statistical, probabilistic, and semantic methods of analyzing and retrieving documents.
Platform: | Size: 759430 | Author: fuji246 | Hits:

[File Operate] MSA_multiindexfilesystem

Description: A multi-indexed file system written for an "Information Storage and Retrieval" file systems course. It is modeled on IBM VSAM, uses variable-length records, and indexes them with a B+ tree. Data is stored in a pile file (*.pile) and the indexes in separate index files (*.idx).
Platform: | Size: 20480 | Author: Toby | Hits:

[Search Engine] meaning

Description: Keyword-based information extraction is inefficient, and latent semantic indexing (LSI) improves on it. Building on an analysis of the structure and principles of LSI, this work explores the possibility of using it to improve Chinese information processing and cross-language extraction of information from Chinese and Western-language texts.
Platform: | Size: 331776 | Author: math | Hits:

[Windows Develop] keyword

Description: Information retrieval is one of the important areas of computer application. Its main operation is looking up a specific item among large amounts of information stored on disk, so building a good index system is essential for efficiency. This program implements a "book title keyword index": from a given bibliography file it generates the corresponding ordered keyword list. In the accompanying figure, table (a) is the bibliography file and table (b) is the corresponding ordered keyword list.
Platform: | Size: 1024 | Author: 杨哲 | Hits:
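
The title keyword index described in the "keyword" entry can be illustrated with a short sketch. The input format (pairs of book number and title), the stop-word list, and the sample titles are all assumptions made for illustration, since the listing does not show the program's actual file format.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; a real program would load a fuller one.
STOP_WORDS = {"a", "an", "and", "of", "the", "to", "in", "for", "on"}


def build_keyword_index(catalog):
    """Build an ordered keyword list from (book_number, title) pairs.

    Returns a list of (keyword, [book_number, ...]) tuples sorted by keyword,
    with each list of book numbers in ascending order.
    """
    index = defaultdict(set)
    for book_number, title in catalog:
        # Split the lower-cased title into alphanumeric words.
        for word in re.findall(r"[a-z0-9]+", title.lower()):
            if word not in STOP_WORDS:
                index[word].add(book_number)
    return [(word, sorted(numbers)) for word, numbers in sorted(index.items())]


# Invented sample bibliography, playing the role of table (a).
catalog = [
    (5, "Computer Data Structures"),
    (10, "Introduction to Data Structures"),
    (23, "Fundamentals of Data Structures"),
    (34, "The Design and Analysis of Computer Algorithms"),
]

# The printed rows play the role of table (b).
for word, numbers in build_keyword_index(catalog):
    print(f"{word:15s} {numbers}")
```

Sorting the keywords once at build time keeps lookups cheap afterwards, because a binary search over the ordered list finds any keyword.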

[Search Engine] AnalyzerViewer_source

Description: Lucene.Net is a high-performance Information Retrieval (IR) library, also known as a search engine library. It contains powerful APIs for creating full-text indexes and building advanced, precise search into your programs. Some people confuse Lucene.Net with a ready-to-use application such as a web search engine/crawler or a file search tool, but it is not such an application; it is a framework library for implementing these difficult technologies yourself. Lucene.Net places no restrictions on what you can index and search, which gives you far more power than other full-text indexing and search implementations: you can index anything that can be represented as text. There are also ways to get Lucene.Net to index HTML, Office documents, PDF files, and much more.
Platform: | Size: 320512 | Author: Yu-Chieh Wu | Hits:

[MultiLanguage] Lucene.Net.Analysis.Cn

Description: What is Lucene? Lucene is an open-source Apache project that implements a full-text search engine in Java; it was later ported to .NET. Lucene is an information retrieval library that lets you add indexing and search to your own application. Its users do not need deep knowledge of full-text retrieval: learning to use a single class from the library is enough to add full-text search to an application. Do not mistake Lucene for a search engine like Google, though. Lucene is not even an application; it is only a tool, a library. You can also think of it as a simple, easy-to-use API that neatly encapsulates indexing and search, and with this API you can conveniently do many search-related tasks. What can Lucene do? Lucene can index and search any data. It does not care what format the data source is in: as long as the data can be converted to text, Lucene can analyze and use it. In other words, whether a file is MS Word, HTML, PDF, or some other format, as long as you can extract its textual content, you can index and search it with Lucene.
Platform: | Size: 95232 | Author: liutonglai | Hits:

[Database system] 70

Description: With the rapid development of multimedia and network technology, image information is used ever more widely, and managing ever-larger image databases and visual information effectively has become an urgent problem. Flexible, efficient, and accurate image retrieval strategies are one of the key technologies for solving it. Content-based image retrieval has therefore become a major research topic worldwide, and many results have been achieved. This paper studies today's popular content-based image retrieval technology, focusing on its algorithms. Over half a year, the author consulted a large amount of related material and carefully studied the basic theory of content-based image retrieval, in particular color histogram theory and the cumulative histogram algorithm, and finally implemented the system in MATLAB. The system provides basic image retrieval: it matches the features of a user-supplied sample image against the images in an image library, finds several images at a relatively small distance from the sample, and displays them to the user in order of increasing distance. After repeated debugging and test runs, the system essentially meets its design goals and runs well. When the user supplies a query image, the system retrieves similar images from the user-supplied image library and returns them in sorted order, achieving the expected result.
Platform: | Size: 380928 | Author: qichao | Hits:
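
The cumulative histogram matching outlined in the entry above can be sketched as follows. The original system was written in MATLAB and its code is not shown, so this Python version is only an illustration under stated assumptions: images are flat lists of single-channel intensities in 0..255, the histogram has 8 bins, and the distance is the L1 (sum of absolute differences) distance, which the description does not specify.

```python
from itertools import accumulate


def cumulative_histogram(pixels, bins=8, max_value=256):
    """Return the normalized cumulative histogram of intensities in [0, max_value)."""
    if not pixels:
        raise ValueError("image has no pixels")
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // max_value] += 1
    total = len(pixels)
    # Running sum of bin counts, scaled so the last entry is 1.0.
    return [running / total for running in accumulate(counts)]


def histogram_distance(hist_a, hist_b):
    """L1 distance between two cumulative histograms (an assumed metric)."""
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


def retrieve(query_pixels, library, top_k=3):
    """Return up to top_k (name, distance) pairs, smallest distance first."""
    query_hist = cumulative_histogram(query_pixels)
    scored = [
        (name, histogram_distance(query_hist, cumulative_histogram(pixels)))
        for name, pixels in library.items()
    ]
    scored.sort(key=lambda item: item[1])
    return scored[:top_k]


# Tiny invented "images" so the ranking is easy to follow.
library = {
    "dark": [10, 20, 30, 40],
    "bright": [200, 210, 220, 250],
    "mixed": [10, 20, 200, 250],
}
for name, dist in retrieve([15, 25, 35, 45], library):
    print(name, round(dist, 2))
```

For color images the same steps would typically be applied per channel and the distances combined. Cumulative histograms are often preferred over plain ones because a small shift in intensity changes the distance only slightly instead of moving all the mass to a neighboring bin.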

[Other] ContentbasedMultimediaInformationalRetrieval

Description: Extending beyond the boundaries of science, art, and culture, content-based multimedia information retrieval provides new paradigms and methods for searching through the myriad variety of media all over the world. This survey reviews 100+ recent articles on content-based multimedia information retrieval and discusses their role in current research directions which include browsing and search paradigms, user studies, affective computing, learning, semantic queries, new features and media types, high performance indexing, and evaluation techniques. Based on the current state of the art, we discuss the major challenges for the future.
Platform: | Size: 196608 | Author: 武玉阳 | Hits:

[Internet-Network] xapian-core-1.2.2.tar

Description: Xapian is a full-text retrieval library written in C++ that plays a role similar to Lucene in Java. While Lucene is already the standard full-text retrieval tool in the Java world, the C/C++ world had no equivalent, and Xapian fills that gap. Xapian's API and retrieval principles resemble Lucene's in many respects, though there are some differences; see Xapian's own documentation for details: http://www.xapian.org/docs/ In addition to its native C++ interface, Xapian provides Perl, PHP, Python, and Ruby interfaces and the corresponding libraries, so you can use it for full-text retrieval directly from your preferred scripting language. From the project's own description: Xapian is an Open Source Search Engine Library, released under the GPL. It's written in C++, with bindings to allow use from Perl, Python, PHP, Java, Tcl, C# and Ruby (so far!). Xapian is a highly adaptable toolkit which allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also supports a rich set of boolean query operators. If you're after a packaged search engine for your website, you should take a look at Omega: an application we supply built upon Xapian. Unlike most other website search solutions, Xapian's versatility allows you to extend Omega to meet your needs as they grow.
Platform: | Size: 3849216 | Author: lijun | Hits:

[Software Engineering] Introduction-to-Information-Retrieval

Description: Introduction to Information Retrieval is the first textbook with a coherent treatment of classical and web information retrieval, including web search and the related areas of text classification and text clustering. Written from a computer science perspective, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents and of methods for evaluating systems, along with an introduction to the use of machine learning methods on text collections.
Platform: | Size: 3818496 | Author: Sara | Hits:

[Software Engineering] wumpus-2009-02-02

Description: Wumpus is an information retrieval system developed at the University of Waterloo. Its main purpose is to study issues that arise in the context of indexing dynamic text collections in multi-user environments. One particular scenario that we are studying is file system search (aka "desktop search"), in which the underlying text collection is very dynamic and the number of expected index update operations is much greater than the number of search queries submitted by the users of the system.
Platform: | Size: 1400832 | Author: 史宇飞 | Hits:

[Software Engineering] Text-Retrieval

Description: Information retrieval systems have evolved from the original purely manual systems to today's systems built on information technology. Throughout this evolution, adapting to new information resources and technologies and improving recall, precision, and response time have remained constant themes, and extracting the most useful information from large numbers of texts has always been a major goal of information processing. This work designs a text retrieval system around the vector space model. After introducing the model, it presents the general structure of a retrieval system based on it and the function of each part, and discusses the key technologies involved. The system represents features with the vector space model, weights feature terms with TF-IDF (Term Frequency-Inverse Document Frequency), indexes with inverted files, measures distance with the cosine of the angle between vectors, and evaluates retrieval performance with recall and precision. It also explores Chinese information retrieval on the basis of the vector space model and related theory. The vector space model has to solve a series of problems, including generating and weighting feature terms and computing similarity (the retrieval operation). Because vector retrieval uses a distance measure between vectors to reflect how well a document satisfies a query, the similarity value should match the real situation as closely as possible and be simple to compute.
Platform: | Size: 713728 | Author: Peng Jin | Hits:
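
The pipeline named in the Text-Retrieval entry (TF-IDF weighting, an inverted file, cosine similarity, and recall/precision evaluation) can be sketched in a few lines. This is an illustrative sketch, not the entry's code: the class and function names, whitespace tokenization, raw term frequency, and the `log(N / df)` form of IDF are all assumptions, and Chinese text would first need word segmentation.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Whitespace tokenizer (an assumption; Chinese text needs word segmentation)."""
    return text.lower().split()


class VectorSpaceIndex:
    def __init__(self, documents):
        """documents: dict mapping document id -> text."""
        self.n_docs = len(documents)
        term_counts = {d: Counter(tokenize(t)) for d, t in documents.items()}
        # Inverted file: term -> {document id: term frequency}.
        self.postings = defaultdict(dict)
        for doc_id, counts in term_counts.items():
            for term, tf in counts.items():
                self.postings[term][doc_id] = tf
        # Length of each document's TF-IDF vector, used for cosine normalization.
        self.norms = {
            doc_id: math.sqrt(sum((tf * self.idf(t)) ** 2 for t, tf in counts.items()))
            for doc_id, counts in term_counts.items()
        }

    def idf(self, term):
        """Inverse document frequency, log(N / df); 0.0 for unseen terms."""
        df = len(self.postings.get(term, ()))
        return math.log(self.n_docs / df) if df else 0.0

    def search(self, query):
        """Return (document id, cosine similarity) pairs, best match first."""
        weights = {t: tf * self.idf(t) for t, tf in Counter(tokenize(query)).items()}
        query_norm = math.sqrt(sum(w * w for w in weights.values()))
        if query_norm == 0:
            return []
        dots = defaultdict(float)
        # Only documents in the postings of a query term are ever touched.
        for term, q_weight in weights.items():
            for doc_id, tf in self.postings.get(term, {}).items():
                dots[doc_id] += q_weight * tf * self.idf(term)
        results = [
            (doc_id, dot / (query_norm * self.norms[doc_id]))
            for doc_id, dot in dots.items()
            if dot > 0
        ]
        return sorted(results, key=lambda item: item[1], reverse=True)


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against a set of relevant ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


docs = {
    "d1": "vector space model for text retrieval",
    "d2": "inverted file index for text search",
    "d3": "image retrieval with color histograms",
}
index = VectorSpaceIndex(docs)
ranked = index.search("vector space retrieval")
print(ranked)
print(precision_recall([doc_id for doc_id, _ in ranked], ["d1"]))
```

The inverted file is what keeps the similarity computation cheap: a query only visits the postings of its own terms, so documents sharing no term with the query are never scored.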

[.net] kps_2.2_SP5

Description: K风 (K Wind) is a professional web search engine system developed in-house by Kwindsoft. It offers advanced intelligent analysis and large-scale data retrieval, and its core consists of four parts: a multithreaded crawling system, an intelligent analysis system, a large-scale indexing system, and a full-text retrieval system. The system uses a professional-grade search engine architecture and supports millisecond-level full-text retrieval over massive data sets. It is a professional full-text retrieval product designed mainly for large and medium-sized industry search engines, local search engines, and specialized information search engines, and is presented as an ideal solution for full-text retrieval over massive data.
Platform: | Size: 3010560 | Author: sdgfdg508 | Hits:

[AI-NN-PR] Inverted-Indexing-for-Text-retrieval

Description: Web search is the quintessential large-data problem. Given an information need expressed as a short query consisting of a few terms, the system's task is to retrieve relevant web objects (web pages, PDF documents, PowerPoint slides, etc.) and present them to the user. How large is the web? It is difficult to compute exactly, but even a conservative estimate would place the size at several tens of billions of pages, totaling hundreds of terabytes (considering text alone). In real-world applications, users demand results quickly from a search engine: query latencies longer than a few hundred milliseconds will try a user's patience. Fulfilling these requirements is quite an engineering feat, considering the amounts of data involved!
Platform: | Size: 598016 | Author: mhk | Hits:

[Console] findinformation

Description: Data mining and intelligent information retrieval. Building the index takes a fairly long time, but once it is built, looking up words in the documents is fast.
Platform: | Size: 4096 | Author: | Hits:

CodeBus www.codebus.net